190 research outputs found

    From Traditional to Modern : Domain Adaptation for Action Classification in Short Social Video Clips

    Short internet video clips like vines exhibit a significantly wilder distribution than traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vines using traditional labeled datasets. To this end, we use a simple domain adaptation strategy based on data augmentation. We utilise the semantic word2vec space as a common subspace in which to embed video features from both the labeled source domain and the unlabelled target domain. Our method incrementally augments the labeled source with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates the noisy semantic information available in the form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in performance. Comment: 9 pages, GCPR, 201
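
    As a rough illustration of the incremental augmentation described above (not the authors' code: the Ridge regressor, the synthetic data, and all names are assumptions), the following sketch maps video features into a word2vec-like label space, pseudo-labels the most confident target samples, and refits the embedding:

```python
# Hypothetical sketch of the incremental augmentation idea: learn a mapping
# from video features into a word2vec-like label space, then repeatedly
# pseudo-label confident target samples and refit.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_classes, d_feat, d_word = 5, 64, 50
label_vecs = rng.normal(size=(n_classes, d_word))   # stand-in for word2vec class embeddings

# Synthetic "source" (labeled) and "target" (unlabeled) video features.
src_X = rng.normal(size=(200, d_feat))
src_y = rng.integers(0, n_classes, size=200)
tgt_X = rng.normal(size=(300, d_feat)) + 0.5        # shifted target distribution

def fit_embedding(X, y):
    """Regress video features onto the word vectors of their labels."""
    model = Ridge(alpha=1.0)
    model.fit(X, label_vecs[y])
    return model

def predict(model, X):
    """Classify by the most similar label vector in the shared semantic space."""
    Z = model.predict(X)                            # (n, d_word)
    sims = Z @ label_vecs.T                         # unnormalised similarity to each class
    return sims.argmax(axis=1), sims.max(axis=1)

X_aug, y_aug = src_X.copy(), src_y.copy()
for it in range(5):
    model = fit_embedding(X_aug, y_aug)
    pseudo_y, conf = predict(model, tgt_X)
    keep = conf >= np.quantile(conf, 0.8)           # keep only the most confident target samples
    X_aug = np.vstack([src_X, tgt_X[keep]])
    y_aug = np.concatenate([src_y, pseudo_y[keep]])
    print(f"iteration {it}: augmented with {keep.sum()} target samples")
```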

    Improving the Accuracy of Action Classification Using View-Dependent Context Information

    Proceedings of: 6th International Conference, HAIS 2011, Wroclaw, Poland, May 23-25, 2011. This paper presents a human action recognition system that decomposes the task into two subtasks. First, a view-independent classifier, shared between the multiple views to be analyzed, is applied to obtain an initial guess of the posterior distribution of the performed action. Then, this posterior distribution is combined with view-based knowledge to improve the action classification. This allows the view-independent component to be reused when a new view has to be analyzed, requiring only the view-dependent knowledge to be specified. An example of applying the system in a smart home domain is discussed. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.
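
    The two-stage combination can be pictured with a minimal sketch, assuming the view-dependent knowledge takes the form of a per-view prior over actions (the camera names and numbers below are purely illustrative):

```python
# Minimal, hypothetical sketch of the two-stage idea: a shared view-independent
# classifier yields a posterior over actions, which is then reweighted by
# view-dependent context (here a simple per-view prior).
import numpy as np

actions = ["walk", "sit", "wave"]

# Stage 1: posterior from the shared, view-independent classifier.
posterior = np.array([0.5, 0.3, 0.2])

# Stage 2: view-dependent knowledge, e.g. "sitting is likely in the sofa view".
view_prior = {"sofa_cam": np.array([0.2, 0.7, 0.1]),
              "door_cam": np.array([0.6, 0.1, 0.3])}

def combine(posterior, prior):
    """Bayes-style reweighting of the shared posterior by view context."""
    fused = posterior * prior
    return fused / fused.sum()

fused = combine(posterior, view_prior["sofa_cam"])
print(dict(zip(actions, np.round(fused, 3))))   # "sit" is promoted by the view context
```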

    Survey on Vision-based Path Prediction

    Path prediction is a fundamental task for estimating how pedestrians or vehicles are going to move in a scene. Because path prediction as a computer vision task takes video as input, various kinds of information used for prediction, such as the environment surrounding the target and the internal state of the target, need to be estimated from the video in addition to predicting the paths themselves. Many prediction approaches that include understanding the environment and the internal state have been proposed. In this survey, we systematically summarize methods of path prediction that take video as input and extract features from the video. Moreover, we introduce datasets used to evaluate path prediction methods quantitatively. Comment: DAPI 201

    Wave Functions, Quantum Diffusion, and Scaling Exponents in Golden-Mean Quasiperiodic Tilings

    We study the properties of wave functions and the wave-packet dynamics in quasiperiodic tight-binding models in one, two, and three dimensions. The atoms in the one-dimensional quasiperiodic chains are coupled by weak and strong bonds aligned according to the Fibonacci sequence. The associated d-dimensional quasiperiodic tilings are constructed from the direct product of d such chains, which yields either the hypercubic tiling or the labyrinth tiling. This approach allows us to consider rather large systems numerically. We show that the wave functions of the system are multifractal and that, in the regime of strong quasiperiodic modulation, their properties can be related to the structure of the system by a renormalization group (RG) approach. We also study the dynamics of wave packets to obtain information about the electronic transport properties. In particular, we investigate the scaling behaviour of the return probability of the wave packet with time. Applying the RG approach again, we show that in the regime of strong quasiperiodic modulation the return probability is governed by the underlying quasiperiodic structure. Further, we discuss lower bounds for the scaling exponent of the width of the wave packet and propose a modified lower bound for the absolutely continuous regime. Comment: 25 pages, 13 figures
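
    For reference, the scaling quantities mentioned above are conventionally defined as follows in this literature (the notation is illustrative and may differ from the paper's; the last line states a commonly cited Guarneri-type lower bound, not necessarily the modified bound proposed in the paper):

```latex
% Time-averaged return probability and wave-packet width, with their
% power-law scaling exponents, and a Guarneri-type lower bound.
\begin{align}
  C(t) &= \frac{1}{t}\int_0^t \bigl|\langle \psi(0)\,|\,\psi(t')\rangle\bigr|^2 \,\mathrm{d}t'
          \;\sim\; t^{-\delta}, \\
  w(t) &= \Bigl(\sum_{\mathbf{r}} |\mathbf{r}-\mathbf{r}_0|^2\,
          |\psi(\mathbf{r},t)|^2\Bigr)^{1/2} \;\sim\; t^{\beta}, \\
  \beta &\;\ge\; \frac{\delta}{d} \qquad \text{(Guarneri-type bound in $d$ dimensions).}
\end{align}
```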

    Fusion of Single View Soft k-NN Classifiers for Multicamera Human Action Recognition

    Proceedings of: 5th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2010), San Sebastián, Spain, June 23-25, 2010. This paper presents two different classifier fusion algorithms applied to the domain of human action recognition from video. A set of cameras observes a person performing an action from a predefined set. For each camera view a 2D descriptor is computed and a posterior on the performed activity is obtained using a soft classifier. These posteriors are combined using voting and a Bayesian network to obtain a single belief measure on which the final decision about the performed action is based. Experiments are conducted with different low-level frame descriptors on the IXMAS dataset, achieving results comparable to state-of-the-art 3D proposals while performing only 2D processing. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.
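
    A minimal sketch of the soft-fusion step, assuming three per-view posteriors and using simple sum and vote rules (the paper's Bayesian network combination is not reproduced here, and all numbers are made up):

```python
# Illustrative sketch (not the paper's code) of fusing per-view soft classifier
# outputs: each camera yields a posterior over actions, and a simple sum or
# vote rule produces the final decision.
import numpy as np

actions = ["check watch", "sit down", "wave", "kick"]

# Hypothetical soft k-NN posteriors from three camera views.
view_posteriors = np.array([
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.3, 0.4, 0.2, 0.1],
])

# Sum rule: average the per-view posteriors.
sum_fused = view_posteriors.mean(axis=0)

# Vote rule: each view votes for its most likely action.
votes = view_posteriors.argmax(axis=1)
vote_fused = np.bincount(votes, minlength=len(actions)) / len(votes)

print("sum rule  :", actions[sum_fused.argmax()])
print("vote rule :", actions[vote_fused.argmax()])
```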

    An Overview of Contest on Semantic Description of Human Activities (SDHA) 2010

    This paper summarizes the results of the 1st Contest on Semantic Description of Human Activities (SDHA), held in conjunction with ICPR 2010. SDHA 2010 consists of three types of challenges: the High-level Human Interaction Recognition Challenge, the Aerial View Activity Classification Challenge, and the Wide-Area Activity Search and Recognition Challenge. The challenges are designed to encourage participants to test existing methodologies and develop new approaches for complex human activity recognition scenarios in realistic environments. We introduce three new public datasets through these challenges, and discuss the results of state-of-the-art activity recognition systems designed and implemented by the contestants. A methodology using spatio-temporal voting [19] successfully classified segmented videos in the UT-Interaction datasets, but had difficulty correctly localizing activities in continuous videos. Both the method using local features [10] and the HMM-based method [18] successfully recognized actions from low-resolution videos (i.e. the UT-Tower dataset). We compare their results in this paper.

    Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions

    We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest-neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity in more challenging conditions (variations in scale and translation). Despite recent advancements for handling manifolds, manifold-based techniques obtain the lowest performance, and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions. Moreover, FVs best deal with moderate scale and translation changes.
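
    One of the manifold-based families compared above can be sketched as nearest-neighbour classification of SPD (covariance) descriptors under the log-Euclidean metric; the data below is synthetic and the descriptor construction is a placeholder, not the paper's pipeline:

```python
# Hedged sketch: 1-NN classification of symmetric positive definite (SPD)
# descriptors under the log-Euclidean metric.
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(1)

def random_spd(d=6):
    """Toy stand-in for a covariance descriptor of an action clip."""
    A = rng.normal(size=(d, d))
    return A @ A.T + d * np.eye(d)          # symmetric positive definite

def log_euclidean_dist(P, Q):
    """d(P, Q) = || logm(P) - logm(Q) ||_F on the SPD manifold."""
    return np.linalg.norm(np.real(logm(P)) - np.real(logm(Q)), ord="fro")

# Toy gallery of SPD descriptors with class labels, plus one probe descriptor.
gallery = [(random_spd(), label) for label in [0, 0, 1, 1, 2, 2]]
probe = random_spd()

# 1-NN classification on the manifold.
pred = min(gallery, key=lambda item: log_euclidean_dist(probe, item[0]))[1]
print("predicted class:", pred)
```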

    Multicamera Action Recognition with Canonical Correlation Analysis and Discriminative Sequence Classification

    Proceedings of: 4th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2011, La Palma, Canary Islands, Spain, May 30 - June 3, 2011. This paper presents a feature fusion approach to the recognition of human actions from multiple cameras that avoids the computation of the 3D visual hull. Action descriptors are extracted for each of the available camera views and projected into a common subspace that maximizes the correlation between the components of the projections. That common subspace is learned using Probabilistic Canonical Correlation Analysis. Action classification is then performed in that subspace using a discriminative classifier. Results of the proposed method are shown for the classification of the IXMAS dataset.
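
    The fusion idea can be sketched with standard CCA from scikit-learn standing in for the paper's Probabilistic CCA, and a linear SVM standing in for the discriminative sequence classifier; all data below is synthetic:

```python
# Rough sketch (assumptions, not the paper's method): project two per-view
# descriptors into a maximally correlated common subspace with CCA, then
# classify in that subspace.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)

n, d1, d2, k = 120, 80, 60, 10
labels = rng.integers(0, 4, size=n)                 # 4 toy action classes

# Per-view action descriptors (e.g. one per camera), correlated via the label.
base = rng.normal(size=(n, k)) + labels[:, None]
view1 = base @ rng.normal(size=(k, d1)) + 0.1 * rng.normal(size=(n, d1))
view2 = base @ rng.normal(size=(k, d2)) + 0.1 * rng.normal(size=(n, d2))

# Learn the common subspace that maximises correlation between the views.
cca = CCA(n_components=k)
z1, z2 = cca.fit(view1, view2).transform(view1, view2)

# Classify in the shared subspace (concatenated projections + linear SVM).
clf = LinearSVC().fit(np.hstack([z1, z2]), labels)
print("training accuracy:", clf.score(np.hstack([z1, z2]), labels))
```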

    View-invariant action recognition

    Human action recognition is an important problem in computer vision. It has a wide range of applications in surveillance, human-computer interaction, augmented reality, video indexing, and retrieval. The varying pattern of spatio-temporal appearance generated by a human action is key to identifying the performed action. A great deal of research has explored these dynamics of spatio-temporal appearance to learn visual representations of human actions. However, most research in action recognition focuses on a few common viewpoints, and these approaches do not perform well when the viewpoint changes. Human actions are performed in a 3-dimensional environment and are projected onto a 2-dimensional space when captured as video from a given viewpoint. Therefore, an action has a different spatio-temporal appearance from different viewpoints. Research in view-invariant action recognition addresses this problem and focuses on recognizing human actions from unseen viewpoints.

    Efficient Spatio-Temporal Edge Descriptor
